BASIC PYTHON FOR RESEARCHERS

by Megat Harun Al Rashid bin Megat Ahmad
last updated: April 14, 2016


6. Numerical Python

In Python, a list is considered as an array and operating on each element of the list can be done by accessing them sequentially (e.g. by using $for$ loop function).


In [1]:
# Without Numpy, to get the square root on a list of numbers:

from math import *

x1 = [23,12,45]
for y, z in enumerate(x1):
    Ex1 = sqrt(z)
    print Ex1


4.79583152331
3.46410161514
6.7082039325

With $NumPy$, elements in the list can be operated directly without the need to access them sequentially. This can be done by transforming the Python list into a $NumPy$ array. Python utilizes this $NumPy$ library as part of its fundamental package for scientific computing. Its major function is to allow easy operations especially on and between homogeneous multi-dimensional arrays. Available features in $NumPy$ can be accessed by importing the library.


In [2]:
from numpy import *

In [3]:
# With Numpy, to get the square root on a list of numbers:

# Creating the Python list
x1 = [23,12,45]

# Transforming the Python list into NumPy array
xnp = array(x1)

# Here we use the available NumPy square root function
Ex1 = sqrt(xnp)

print xnp
print Ex1


[23 12 45]
[ 4.79583152  3.46410162  6.70820393]

Elements of $NumPy$ array are not separated by comma when printed. They are separated only by space.

Performing mathematical operation between arrays requires part of these arrays to be of the same size (e.g. 5 x 10 array can be multiplied with another 5 x 10 array or 5 x 1 array). This will be explored further).


In [4]:
# Array operation on real and imaginary parts to obtain complex conjugate 

i = array([12.7,9.3,0.8])
j = array([4.5j,2.7j,3.1j])

print (i + j)*(i + j).conjugate()


[ 181.54+0.j   93.78+0.j   10.25+0.j]

6.1 Features of NumPy Array

Creating an array using $NumPy$ can be done in several ways. In all the following examples, $NumPy$ is imported as $np$. Therefore $NumPy$ can be initially accessed by using $np.\{function\}$. This allows the separate usage of similar wordings $NumPy$ and Python $math$ function like $sqrt$. It is also possible to use $NumPy$ function names directly that have no equivalent in Python functions.


In [5]:
import numpy as np

# Creating a one-dimensional array with sequential integer values

a = np.arange(15)
a


Out[5]:
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14])

This create a one dimensional array with 15 elements starting from 0 to 14.

The range and increment of the array elements during array creation can be specified by using the (min, max, add) arguments in the $arange$ function.


In [6]:
'''Below means that the g array is going to contains range of elements with minimum
value of 10 (inclusive of 10) up to a maximum value of 50 (exclusive of or less than 50) and with
an increment of 5'''

g = np.arange(10,50,5)     
print g


[10 15 20 25 30 35 40 45]

In [7]:
# To include 50, just set 50 to 51

g = np.arange(10,51,5)     
print g


[10 15 20 25 30 35 40 45 50]

Another useful array creating function is $linspace$ which allows specifying (inclusive both) the minimum and maximum values and the number of array elements.


In [8]:
x = np.linspace(0,2*np.pi,20)         # note: np.pi is used instead of just pi
print x


[ 0.          0.33069396  0.66138793  0.99208189  1.32277585  1.65346982
  1.98416378  2.31485774  2.64555171  2.97624567  3.30693964  3.6376336
  3.96832756  4.29902153  4.62971549  4.96040945  5.29110342  5.62179738
  5.95249134  6.28318531]

In [9]:
# Checking the inclusivity
print x.min()
print x.max()
print x.size


0.0
6.28318530718
20

To create a higher dimensional array, we can use any one of these functions and then use the $reshape$ function. Let reshapes variable a into a two dimensional 3 x 5 array.


In [10]:
# Creating a two-dimensional 3 x 5 array from b 

b = a.reshape(3,5)
b


Out[10]:
array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14]])

Shaping a three dimensional 3 x 3 x 3 array:


In [11]:
c = np.arange(27).reshape(3,3,3)
print c


[[[ 0  1  2]
  [ 3  4  5]
  [ 6  7  8]]

 [[ 9 10 11]
  [12 13 14]
  [15 16 17]]

 [[18 19 20]
  [21 22 23]
  [24 25 26]]]

We can check the shape, dimension and size of these arrays, as well as the mean and median of the arrays.


In [12]:
print c.shape        # shape of array c
print b.ndim         # dimension of array b
print a.size         # size of array a or the number of elements in array a


(3, 3, 3)
2
15

In [13]:
# mean value for the c array
np.mean(c)


Out[13]:
13.0

In [14]:
# mean value for b array elements in each column
np.mean(b, axis=0)


Out[14]:
array([ 5.,  6.,  7.,  8.,  9.])

In [15]:
# mean value for b array elements in each row
np.mean(b, axis=1)


Out[15]:
array([  2.,   7.,  12.])

In [16]:
# average value for a all array elements
np.average(a)


Out[16]:
7.0

In [17]:
# sum values for c array elements in axis 2
np.sum(c,axis=2)


Out[17]:
array([[ 3, 12, 21],
       [30, 39, 48],
       [57, 66, 75]])

Earlier in this tutorial, a simple mathematical operations for complex number array has been shown. $NumPy$ arrays can only be operated with each other when the arrays are of the same size and dimension.


In [18]:
# Finding the sin^2(x) + cos^2(x) (trigonometric identity)

np.sin(x)**2 + np.cos(x)**2         # note: np.sin and np.cos are used instead of just sin and cos


Out[18]:
array([ 1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,
        1.,  1.,  1.,  1.,  1.,  1.,  1.])

In [19]:
z = np.linspace(1,5.347,20)

In [20]:
x*z         # note:  arrays of the same size and dimension


Out[20]:
array([  0.        ,   0.40635326,   0.96402512,   1.67301557,
         2.53332462,   3.54495226,   4.7078985 ,   6.02216334,
         7.48774677,   9.1046488 ,  10.87286942,  12.79240864,
        14.86326646,  17.08544287,  19.45893787,  21.98375147,
        24.65988367,  27.48733446,  30.46610385,  33.59619184])

In [21]:
a1 = np.linspace(1,8.75,12).reshape(3,4)       # a1 is 3 x 4 array
a1


Out[21]:
array([[ 1.        ,  1.70454545,  2.40909091,  3.11363636],
       [ 3.81818182,  4.52272727,  5.22727273,  5.93181818],
       [ 6.63636364,  7.34090909,  8.04545455,  8.75      ]])

In [22]:
b1 = np.linspace(1,4,4)        # b1 is 1 x 4 array
b1


Out[22]:
array([ 1.,  2.,  3.,  4.])

In [23]:
a1*b1


Out[23]:
array([[  1.        ,   3.40909091,   7.22727273,  12.45454545],
       [  3.81818182,   9.04545455,  15.68181818,  23.72727273],
       [  6.63636364,  14.68181818,  24.13636364,  35.        ]])

It is possible to operate this a1 with b1 as one of part of the array (the column) is equivalent in size.

array row column
a1 $$3$$ $$4$$
b1 $$1$$ $$4$$

Let c1 be a 1 x 3 array:


In [24]:
c1 = np.linspace(1,3,3)
c1


Out[24]:
array([ 1.,  2.,  3.])

In [25]:
a1*c1


---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-25-7559c13dd0cf> in <module>()
----> 1 a1*c1

ValueError: operands could not be broadcast together with shapes (3,4) (3,) 

The interpreter produces an error as neither row nor column of a1 is equivalent to c1.

array row column
a1 $$3$$ $$4$$
c1 $$1$$ $$3$$

It is still possible to operate a1 with c1. This can be done by transposing a1 (a1 now becomes 4 x 3 array, so the column size is the same as c1 column size).


In [26]:
a1.T*c1


Out[26]:
array([[  1.        ,   7.63636364,  19.90909091],
       [  1.70454545,   9.04545455,  22.02272727],
       [  2.40909091,  10.45454545,  24.13636364],
       [  3.11363636,  11.86363636,  26.25      ]])

Other type of arrays are array with ones and zeroes. These are flexible arrays and can be used for many purposes e.g. indexing collision particles in two dimensions. There is also the $mgrid$ function that is similar to meshgrid in MATLAB.


In [27]:
y = np.ones((10,10))
print y
z = np.zeros((5,5))
print z


[[ 1.  1.  1.  1.  1.  1.  1.  1.  1.  1.]
 [ 1.  1.  1.  1.  1.  1.  1.  1.  1.  1.]
 [ 1.  1.  1.  1.  1.  1.  1.  1.  1.  1.]
 [ 1.  1.  1.  1.  1.  1.  1.  1.  1.  1.]
 [ 1.  1.  1.  1.  1.  1.  1.  1.  1.  1.]
 [ 1.  1.  1.  1.  1.  1.  1.  1.  1.  1.]
 [ 1.  1.  1.  1.  1.  1.  1.  1.  1.  1.]
 [ 1.  1.  1.  1.  1.  1.  1.  1.  1.  1.]
 [ 1.  1.  1.  1.  1.  1.  1.  1.  1.  1.]
 [ 1.  1.  1.  1.  1.  1.  1.  1.  1.  1.]]
[[ 0.  0.  0.  0.  0.]
 [ 0.  0.  0.  0.  0.]
 [ 0.  0.  0.  0.  0.]
 [ 0.  0.  0.  0.  0.]
 [ 0.  0.  0.  0.  0.]]

In [28]:
i, j = np.mgrid[0:4, 0:4]       # similar to meshgrid in MATLAB

In [29]:
i


Out[29]:
array([[0, 0, 0, 0],
       [1, 1, 1, 1],
       [2, 2, 2, 2],
       [3, 3, 3, 3]])

In [30]:
j


Out[30]:
array([[0, 1, 2, 3],
       [0, 1, 2, 3],
       [0, 1, 2, 3],
       [0, 1, 2, 3]])

Exercise 6.1: Create this 2-dimensional array using NumPy:

$$\left[ \begin{array}{cc} 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0\\ 1 & 1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 & 0\\ 1 & 1 & 1 & 1 & 1 & 1 & 0 & 0 & 0 & 0\\ 1 & 1 & 1 & 1 & 1 & 1 & 1 & 0 & 0 & 0\\ 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 0 & 0\\ 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 0\end{array} \right]$$

In [31]:
m, n = np.mgrid[0:10, 0:10]
(m > n)*1


Out[31]:
array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
       [1, 1, 1, 0, 0, 0, 0, 0, 0, 0],
       [1, 1, 1, 1, 0, 0, 0, 0, 0, 0],
       [1, 1, 1, 1, 1, 0, 0, 0, 0, 0],
       [1, 1, 1, 1, 1, 1, 0, 0, 0, 0],
       [1, 1, 1, 1, 1, 1, 1, 0, 0, 0],
       [1, 1, 1, 1, 1, 1, 1, 1, 0, 0],
       [1, 1, 1, 1, 1, 1, 1, 1, 1, 0]])

$NumPy$ array can be sliced, extracted and reassigned similar to the operations on Python sequences.


In [32]:
# Slicing
x = np.linspace(0,2*np.pi,20)
print x
print x[3:5]


[ 0.          0.33069396  0.66138793  0.99208189  1.32277585  1.65346982
  1.98416378  2.31485774  2.64555171  2.97624567  3.30693964  3.6376336
  3.96832756  4.29902153  4.62971549  4.96040945  5.29110342  5.62179738
  5.95249134  6.28318531]
[ 0.99208189  1.32277585]

In [33]:
y = np.arange(15).reshape(3,5)
print y
print y[1]                   # the whole row of index 1


[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]]
[5 6 7 8 9]

In [34]:
z = y[0:2,1:4]               # assigning to variable
z


Out[34]:
array([[1, 2, 3],
       [6, 7, 8]])

In [35]:
y[0:2,1:4] = 0               # reassigned
y


Out[35]:
array([[ 0,  0,  0,  0,  4],
       [ 5,  0,  0,  0,  9],
       [10, 11, 12, 13, 14]])

In [36]:
# reassigned with new array
y[0:2,1:4] = np.array([[45,89,12],[12,73,69]])       
y


Out[36]:
array([[ 0, 45, 89, 12,  4],
       [ 5, 12, 73, 69,  9],
       [10, 11, 12, 13, 14]])

There are many other ways to create $NumPy$ array, some are quite fancy e.g.:


In [37]:
X = np.array([n**np.e for n in range(5)])       # note: np.e represent exponential constant
X


Out[37]:
array([  0.        ,   1.        ,   6.58088599,  19.81299075,  43.30806043])

In [38]:
# i is row index and j is column index
W = np.array([[j+i*3 for j in range(3)] for i in range(3)])
W


Out[38]:
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])

6.2 File Input/Output

$NumPy$ array can be saved into a binary file which later can be loaded. 1-D and 2-D arrays can also be saved in text format.


In [39]:
'''Let us build a 3-dimensional 5 x 5 x 5 array 
containing elements created using random number'''

rand_array = np.random.rand(4,4,4)
rand_array


Out[39]:
array([[[ 0.16288574,  0.34512979,  0.42734996,  0.69323916],
        [ 0.12593153,  0.10621008,  0.15867052,  0.39990068],
        [ 0.66035874,  0.80057956,  0.81382715,  0.03606962],
        [ 0.50402251,  0.12622572,  0.39062081,  0.66417067]],

       [[ 0.31397343,  0.16151298,  0.18394905,  0.58534405],
        [ 0.16391213,  0.57070191,  0.40637193,  0.09734607],
        [ 0.23650067,  0.51281238,  0.77877061,  0.78160142],
        [ 0.24272477,  0.41665678,  0.73101319,  0.60756296]],

       [[ 0.83785594,  0.19462827,  0.41507672,  0.92640519],
        [ 0.41844881,  0.22444927,  0.05306902,  0.7033712 ],
        [ 0.43892731,  0.66920604,  0.96161724,  0.24994796],
        [ 0.36591196,  0.97106874,  0.25035091,  0.01183044]],

       [[ 0.80726725,  0.68726225,  0.58878418,  0.96129402],
        [ 0.63339747,  0.21223943,  0.21183456,  0.7977489 ],
        [ 0.75760042,  0.25650812,  0.76265736,  0.55705083],
        [ 0.97963256,  0.91662955,  0.41953759,  0.04472739]]])

(The default random number generator algorithm is Mersenne-Twister)


In [40]:
np.save("Tutorial6/random_3D_array.npy", rand_array)

In [41]:
np.load("Tutorial6/random_3D_array.npy")


Out[41]:
array([[[ 0.16288574,  0.34512979,  0.42734996,  0.69323916],
        [ 0.12593153,  0.10621008,  0.15867052,  0.39990068],
        [ 0.66035874,  0.80057956,  0.81382715,  0.03606962],
        [ 0.50402251,  0.12622572,  0.39062081,  0.66417067]],

       [[ 0.31397343,  0.16151298,  0.18394905,  0.58534405],
        [ 0.16391213,  0.57070191,  0.40637193,  0.09734607],
        [ 0.23650067,  0.51281238,  0.77877061,  0.78160142],
        [ 0.24272477,  0.41665678,  0.73101319,  0.60756296]],

       [[ 0.83785594,  0.19462827,  0.41507672,  0.92640519],
        [ 0.41844881,  0.22444927,  0.05306902,  0.7033712 ],
        [ 0.43892731,  0.66920604,  0.96161724,  0.24994796],
        [ 0.36591196,  0.97106874,  0.25035091,  0.01183044]],

       [[ 0.80726725,  0.68726225,  0.58878418,  0.96129402],
        [ 0.63339747,  0.21223943,  0.21183456,  0.7977489 ],
        [ 0.75760042,  0.25650812,  0.76265736,  0.55705083],
        [ 0.97963256,  0.91662955,  0.41953759,  0.04472739]]])

In [42]:
rand2_array = np.random.rand(4,4)

The coming examples show how to save array into text file.


In [43]:
np.savetxt("Tutorial6/random_2D_array.txt", rand2_array, delimiter=',')

In the above example, the array _rand2array is saved as comma separated values by specifying comma as the delimiter used.


In [44]:
fh_csv = open("Tutorial6/random_2D_array.txt")
fh_csvList = fh_csv.readlines()
fh_csvList


Out[44]:
['9.396302828156297693e-01,4.331645893110744883e-01,9.239207829088839086e-01,2.234272506892963639e-01\n',
 '7.538477047974018186e-01,8.772528705985704889e-01,7.136011741989152224e-01,8.036578489765119349e-01\n',
 '5.601472051284052123e-01,2.762968576279914990e-01,6.366152572987821001e-01,1.621138306086000735e-02\n',
 '1.981689884819235470e-01,5.157183378274130536e-01,7.899071384294774623e-01,5.171718940133459563e-01\n']

It is possible to specify or formatted the decimal value when saving a text file.


In [45]:
np.savetxt("Tutorial6/random_2D_arrayF.txt", rand2_array, \
           delimiter=',', fmt='%.3f')

In [46]:
fh_csv1 = open("Tutorial6/random_2D_arrayF.txt")
fh_csv1List = fh_csv1.readlines()
fh_csv1List


Out[46]:
['0.940,0.433,0.924,0.223\n',
 '0.754,0.877,0.714,0.804\n',
 '0.560,0.276,0.637,0.016\n',
 '0.198,0.516,0.790,0.517\n']

Similar with binary file, a text file can be loaded as an array with delimiter used specified.


In [47]:
fromtxtFile = np.loadtxt("Tutorial6/random_2D_arrayF.txt", delimiter=',')

In [48]:
fromtxtFile


Out[48]:
array([[ 0.94 ,  0.433,  0.924,  0.223],
       [ 0.754,  0.877,  0.714,  0.804],
       [ 0.56 ,  0.276,  0.637,  0.016],
       [ 0.198,  0.516,  0.79 ,  0.517]])

6.3 Matrix Operations

What we have seen previously were scalar array operations. $NumPy$ array can also undergoes matrix operations. $NumPy$ has special functions for matrix operations.


In [49]:
# Creating two arrays with random numbers, M1 and M2
M1 = np.random.rand(3,3)*np.linspace(1,36,9).reshape(3,3)
M2 = np.random.rand(3,3)*np.linspace(1,2*np.e,9).reshape(3,3)
M1, M2


Out[49]:
(array([[  0.85334341,   1.52157177,   5.74146348],
        [  2.18733933,  17.0419147 ,  10.15834702],
        [  0.04433136,   1.07833879,   0.07984334]]),
 array([[ 0.98875989,  0.80478647,  1.39807629],
        [ 2.2324394 ,  0.87968643,  1.22664058],
        [ 1.67391861,  0.79993591,  4.3545523 ]]))

In [50]:
# dot product
np.dot(M1,M2)


Out[50]:
array([[ 13.85131108,   6.6180681 ,  28.0609639 ],
       [ 57.21204141,  24.87790884,  68.19742486],
       [  2.58481034,   1.04814684,   1.73239476]])

In [51]:
N1 = np.arange(2,8,2)
N1


Out[51]:
array([2, 4, 6])

In [52]:
np.dot(M1,N1)


Out[52]:
array([  42.2417548 ,  133.49241959,    4.88107794])

In [53]:
np.dot(N1,N1)


Out[53]:
56

Standard arithmetic operators (+, -, *) can be used for matrix operations by casting the $NumPy$ array as of type matrix.


In [54]:
W1 = np.matrix(M1)
W1


Out[54]:
matrix([[  0.85334341,   1.52157177,   5.74146348],
        [  2.18733933,  17.0419147 ,  10.15834702],
        [  0.04433136,   1.07833879,   0.07984334]])

In [55]:
Y = np.matrix(N1).T  
Y


Out[55]:
matrix([[2],
        [4],
        [6]])

In [56]:
W2 = np.matrix(M2)

W1*W2                # instead of using np.dot() function


Out[56]:
matrix([[ 13.85131108,   6.6180681 ,  28.0609639 ],
        [ 57.21204141,  24.87790884,  68.19742486],
        [  2.58481034,   1.04814684,   1.73239476]])

Matrix operations must adhered to standard matrix agebra and it is good practice to check the shape of the matrices before any operations.


In [57]:
np.shape(W1), np.shape(Y)


Out[57]:
((3, 3), (3, 1))

In [58]:
Y + W1*Y


Out[58]:
matrix([[  44.2417548 ],
        [ 137.49241959],
        [  10.88107794]])

A few others mathematical functions on matrix include:


In [59]:
# Finding determinant

np.linalg.det(W1)


Out[59]:
1.437691195811446

In [60]:
# Inverse matrix

np.linalg.inv(W1)


Out[60]:
matrix([[ -6.67282116,   4.22187702, -57.30637923],
        [  0.19175805,  -0.12964752,   2.70570652],
        [  1.11512236,  -0.59313151,   7.80029246]])

In [61]:
np.linalg.inv(W1)*W1


Out[61]:
matrix([[  1.00000000e+00,  -1.42108547e-14,  -3.55271368e-15],
        [  5.55111512e-17,   1.00000000e+00,   2.49800181e-16],
        [ -5.55111512e-17,   0.00000000e+00,   1.00000000e+00]])

This tutorial part just shows the small fraction on $NumPy$ capabilities so that users will familiarize with the syntax. More on $NumPy$ can be found on: